61 research outputs found

    Message Passing in C-RAN: Joint User Activity and Signal Detection

    In a cloud radio access network (C-RAN), remote radio heads (RRHs) and users are uniformly distributed over a large area, so the channel matrix can be considered sparse. Based on this phenomenon, RRHs only need to detect the relatively strong signals from nearby users and can ignore the weak signals from distant users, which helps in developing low-complexity detection algorithms without much performance loss. However, before detection, RRHs need to obtain real-time user activity information through the dynamic grant procedure, which causes enormous latency. To address this issue, in this paper, we consider a grant-free C-RAN system and propose a low-complexity Bernoulli-Gaussian message passing (BGMP) algorithm based on the sparsified channel, which jointly detects the user activity and signal. Since active users are assumed to transmit Gaussian signals at any time, the user activity can be regarded as a Bernoulli variable and the signals from all users obey a Bernoulli-Gaussian distribution. In the BGMP, the detection functions for signals are designed with respect to the Bernoulli-Gaussian variable. Numerical results demonstrate the robustness and effectiveness of the BGMP. That is, for different sparsified channels, the BGMP can approach the mean-square error (MSE) of the genie-aided sparse minimum mean-square error (GA-SMMSE) detector, which knows the user activity information exactly. Meanwhile, the fast convergence of the BGMP and its strong recovery capability for user activity are also verified. Comment: Conference, 6 pages, 7 figures, accepted by IEEE Globecom 201
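
    The Bernoulli-Gaussian signal model mentioned in this abstract can be illustrated with a minimal sketch. This is not the BGMP algorithm itself; the function name, the activity probability, and the unit variance are assumptions chosen only for illustration. Each user is active with some probability (a Bernoulli variable), and an active user transmits a Gaussian sample while an inactive user contributes zero.

```python
import random

def sample_bernoulli_gaussian(num_users, activity_prob, sigma=1.0, seed=None):
    """Draw one Bernoulli-Gaussian signal per user (illustrative model only).

    activity_prob and sigma are hypothetical parameters, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    # Bernoulli activity indicator: 1 if the user is active, else 0.
    activity = [1 if rng.random() < activity_prob else 0 for _ in range(num_users)]
    # Active users send a zero-mean Gaussian sample; inactive users send 0.
    signals = [a * rng.gauss(0.0, sigma) for a in activity]
    return activity, signals

if __name__ == "__main__":
    activity, signals = sample_bernoulli_gaussian(10, activity_prob=0.3, seed=0)
    print(activity)
    print(signals)
```

    A joint detector would then have to infer both the activity indicators and the signal values from the received observations, which is the problem the abstract describes.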

    Low-Complexity and Information-Theoretic Optimal Memory AMP for Coded Generalized MIMO

    This paper considers a generalized multiple-input multiple-output (GMIMO) system with practical assumptions, such as massive antennas, practical channel coding, arbitrary input distributions, and general right-unitarily-invariant channel matrices (covering Rayleigh fading and certain ill-conditioned and correlated channel matrices). Orthogonal/vector approximate message passing (OAMP/VAMP) has been proved to be information-theoretically optimal in GMIMO, but it suffers from high complexity. Meanwhile, low-complexity memory approximate message passing (MAMP) was shown to be Bayes optimal in GMIMO, but channel coding was ignored. Therefore, how to design a low-complexity and information-theoretically optimal receiver for GMIMO is still an open issue. In this paper, we propose an information-theoretically optimal MAMP receiver for coded GMIMO, whose achievable rate analysis and optimal coding principle are provided to demonstrate its information-theoretic optimality. Specifically, state evolution (SE) for MAMP is intricately multi-dimensional because of the nature of local memory detection. To address this, a fixed-point consistency lemma is proposed to derive the simplified variational SE (VSE) for MAMP, based on which the achievable rate of MAMP is calculated, and the optimal coding principle is derived to maximize the achievable rate. Subsequently, we prove the information-theoretic optimality of MAMP. Numerical results show that the finite-length performance of MAMP with optimized LDPC codes is about 1.0 to 2.7 dB away from the associated constrained capacities. It is worth noting that MAMP can achieve the same performance as OAMP/VAMP with 0.4% of the time consumption for large-scale systems. Comment: 6 pages, 6 figures, accepted at GLOBECOM 202

    Capacity-Achieving MIMO-NOMA: Iterative LMMSE Detection

    This paper considers a low-complexity iterative Linear Minimum Mean Square Error (LMMSE) multi-user detector for the Multiple-Input and Multiple-Output system with Non-Orthogonal Multiple Access (MIMO-NOMA), where multiple single-antenna users simultaneously communicate with a multiple-antenna base station (BS). While LMMSE, being a linear detector, has low complexity, it has suboptimal performance in multi-user detection scenarios due to the mismatch between LMMSE detection and multi-user decoding. Therefore, in this paper, we provide the matching conditions between the detector and decoders for MIMO-NOMA, which are then used to derive the achievable rate of the iterative detection. We prove that a matched iterative LMMSE detector can achieve (i) the optimal capacity of symmetric MIMO-NOMA with any number of users, (ii) the optimal sum capacity of asymmetric MIMO-NOMA with any number of users, (iii) all the maximal extreme points in the capacity region of asymmetric MIMO-NOMA with any number of users, and (iv) all points in the capacity region of two-user and three-user asymmetric MIMO-NOMA systems. In addition, a practical low-complexity error-correcting multiuser code, called the irregular repeat-accumulate code, is designed to match the LMMSE detector. Numerical results show that the bit error rate performance of the proposed iterative LMMSE detection outperforms the state-of-the-art methods and is within 0.8 dB of the associated capacity limit. Comment: Accepted by IEEE TSP, 16 pages, 9 figures. This is the first work that proves the low-complexity iterative receiver (Parallel Interference Cancellation) can achieve the capacity of multi-user MIMO systems. arXiv admin note: text overlap with arXiv:1604.0831

    Bridging the Granularity Gap for Acoustic Modeling

    While the Transformer has become the de facto standard for speech, modeling on fine-grained frame-level features remains an open challenge for capturing long-distance dependencies and distributing the attention weights. We propose Progressive Down-Sampling (PDS), which gradually compresses the acoustic features into coarser-grained units containing more complete semantic information, like text-level representations. In addition, we develop a representation fusion method to alleviate the information loss that inevitably occurs during high compression. In this way, we compress the acoustic features into 1/32 of the initial length while achieving better or comparable performance on the speech recognition task. As a bonus, it yields inference speedups ranging from 1.20× to 1.47×. By reducing the modeling burden, we also achieve competitive results when training on the more challenging speech translation task. Comment: ACL 2023 Finding
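
    The 1/32 compression figure can be made concrete with a small sketch of staged down-sampling. This is not the PDS method from the paper: the use of five stages with stride 2 (since 2^5 = 32), the averaging of adjacent frames, and all function names are assumptions made only to show how sequence length shrinks stage by stage.

```python
def downsample_once(frames, stride=2):
    """Average each group of `stride` adjacent frames (last group may be shorter)."""
    out = []
    for start in range(0, len(frames), stride):
        group = frames[start:start + stride]
        dim = len(group[0])
        # Element-wise mean over the frames in this group.
        out.append([sum(f[d] for f in group) / len(group) for d in range(dim)])
    return out

def progressive_downsample(frames, num_stages=5, stride=2):
    """Apply several down-sampling stages and record the length after each.

    num_stages=5 and stride=2 are assumptions chosen so the total
    compression is 2**5 = 32; the paper's actual configuration may differ.
    """
    lengths = [len(frames)]
    for _ in range(num_stages):
        frames = downsample_once(frames, stride)
        lengths.append(len(frames))
    return frames, lengths

if __name__ == "__main__":
    # 64 one-dimensional "frames" holding the values 0.0 .. 63.0.
    frames = [[float(i)] for i in range(64)]
    compressed, lengths = progressive_downsample(frames)
    print(lengths)
    print(compressed)
```

    Compressing gradually, instead of in one step, lets each stage operate on a shorter sequence than the one before it, which is where the reduced modeling burden described in the abstract would come from.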

    NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches

    Accurate dietary intake estimation is critical for informing policies and programs to support healthy eating, as malnutrition has been directly linked to decreased quality of life. However, self-reporting methods such as food diaries suffer from substantial bias. Other conventional dietary assessment techniques and emerging alternative approaches such as mobile applications incur high time costs and may necessitate trained personnel. Recent work has focused on using computer vision and machine learning to automatically estimate dietary intake from food images, but the lack of comprehensive datasets with diverse viewpoints, modalities and food annotations hinders the accuracy and realism of such methods. To address this limitation, we introduce NutritionVerse-Synth, the first large-scale dataset of 84,984 photorealistic synthetic 2D food images with associated dietary information and multimodal annotations (including depth images, instance masks, and semantic masks). Additionally, we collect a real image dataset, NutritionVerse-Real, containing 889 images of 251 dishes to evaluate realism. Leveraging these novel datasets, we develop and benchmark NutritionVerse, an empirical study of various dietary intake estimation approaches, including indirect segmentation-based and direct prediction networks. We further fine-tune models pretrained on synthetic data with real images to provide insights into the fusion of synthetic and real data. Finally, we release both datasets (NutritionVerse-Synth, NutritionVerse-Real) on https://www.kaggle.com/nutritionverse/datasets as part of an open initiative to accelerate machine learning for dietary sensing

    Segment Anything Model for Medical Images?

    The Segment Anything Model (SAM) is the first foundation model for general image segmentation. It introduces a novel promptable segmentation task, enabling zero-shot image segmentation using the pre-trained model via two main modes: automatic everything and manual prompt. SAM has achieved impressive results on various natural image segmentation tasks. However, medical image segmentation (MIS) is more challenging due to the complex modalities, fine anatomical structures, uncertain and complex object boundaries, and wide-range object scales. Meanwhile, zero-shot and efficient MIS can reduce the annotation time and boost the development of medical image analysis. Hence, SAM seems to be a potential tool, and its performance on large medical datasets should be further validated. We collected and sorted 52 open-source datasets and built a large medical segmentation dataset with 16 modalities, 68 objects, and 553K slices. We conducted a comprehensive analysis of different SAM testing strategies on the so-called COSMOS 553K dataset. Extensive experiments validate that SAM performs better with manual hints like points and boxes for object perception in medical images, leading to better performance in prompt mode compared to everything mode. Additionally, SAM shows remarkable performance for some specific objects and modalities, but is imperfect or even fails completely in other situations. Finally, we analyze the influence of different factors (e.g., the Fourier-based boundary complexity and size of the segmented objects) on SAM's segmentation performance. Extensive experiments validate that SAM's zero-shot segmentation capability is not sufficient to ensure its direct application to MIS. Comment: 23 pages, 14 figures, 12 table